Transformers are widely used in NLP tasks. However, current approaches to leveraging transformers for language understanding expose one weak spot: number understanding. In some scenarios numbers occur frequently, especially in semi-structured data such as tables. Yet current approaches to number-rich tasks with transformer-based language models discard or lose some of the numeracy information (e.g., by breaking numbers into sub-word tokens), which leads to many number-related errors. In this paper, we propose the LUNA framework, which improves the numerical reasoning and calculation capabilities of transformer-based language models. With the number plugin of NumTok and NumBed, LUNA represents each number as a whole in the model input. With number pre-training, including regression loss and model distillation, LUNA bridges the gap between number and vocabulary embeddings. To the best of our knowledge, this is the first work that explicitly injects numeracy capability into language models using Number Plugins. Besides evaluating toy models on toy tasks, we evaluate LUNA on three large-scale transformer models (RoBERTa, BERT, TabBERT) over three different downstream tasks (TAT-QA, TabFact, CrediTrans), and observe that the performance of the language models is consistently improved by LUNA. The augmented models also improve on the official baseline of TAT-QA (EM: 50.15 -> 59.58) and achieve SOTA performance on CrediTrans (F1 = 86.17).
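The idea of representing each number as a whole, rather than as sub-word pieces, can be illustrated with a minimal sketch. This is not NumTok or NumBed; the function name `split_keep_numbers` and the number pattern are assumptions made only for illustration, and the abstract does not specify how LUNA actually tokenizes or embeds numbers.

```python
import re

# Assumed number pattern for this sketch only: optional sign, digits with
# optional thousands separators, optional decimal part. The lookbehind keeps
# digits that sit inside a word (e.g. "F1") from being split off.
NUMBER_RE = re.compile(r"(?<!\w)[-+]?\d+(?:,\d{3})*(?:\.\d+)?")


def split_keep_numbers(text):
    """Split text into (token, value) pairs.

    Each number stays one token and carries its parsed float value;
    every other whitespace-separated token carries None.
    """
    pieces = []
    pos = 0
    for match in NUMBER_RE.finditer(text):
        # Emit the ordinary tokens that precede this number.
        for word in text[pos:match.start()].split():
            pieces.append((word, None))
        raw = match.group()
        pieces.append((raw, float(raw.replace(",", ""))))
        pos = match.end()
    # Emit whatever follows the last number.
    for word in text[pos:].split():
        pieces.append((word, None))
    return pieces
```

A downstream model could then feed the parsed value to a dedicated number encoder instead of looking up sub-word embeddings; how that encoder is built is outside what this sketch shows.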
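Many data analysis tasks rely heavily on a deep understanding of tables (multi-dimensional data). Across these tasks, there are commonly used metadata attributes of table fields/columns. In this paper, we identify four such analysis metadata: the measure/dimension dichotomy, common field roles, semantic field type, and default aggregation function. Although inferring these metadata faces the challenge of insufficient supervision signals, existing knowledge and an understanding of data distributions can be leveraged. To infer these metadata for raw tables, we propose a multi-task Metadata model that fuses field distribution and knowledge graph information into pre-trained tabular models. For model training and evaluation, we collect a large corpus of analysis metadata (~582K tables from private spreadsheets and public tabular datasets) by using diverse smart supervisions from downstream tasks. Our best model achieves accuracy = 98%, hit rate at top-1 > 67%, accuracy > 80%, and accuracy = 88% on the four analysis metadata inference tasks, respectively. It outperforms a series of baselines based on rules, traditional machine learning methods, and pre-trained tabular models. Analysis metadata models are deployed in a popular data analysis product, helping downstream intelligent features such as insights mining, chart/pivot table recommendation, and natural language QA ...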
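We are concerned with data-driven requirements engineering, and in particular with the consideration of user reviews. These online reviews are a rich source of information for extracting new requirements and improvement requests. In this work, we provide an automated analysis using CamemBERT, a state-of-the-art language model for French. We created a multi-label classification dataset of 6,000 user reviews from three applications in the Health & Fitness domain. The results are encouraging and suggest that reviews concerning requests for new features can be identified automatically. The dataset is available at: https://github.com/jl-wei/apia2022-french-user-reviews-classification-dataset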
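Proteins are essential components of human life, and their structures are important for function and mechanism analysis. Recent work has shown the potential of AI-driven methods for protein structure prediction. However, the development of new models is restricted by the lack of datasets and benchmark training procedures. To the best of our knowledge, existing open-source datasets are far from sufficient to meet the needs of modern protein sequence-structure related research. To solve this problem, we present the first million-level protein structure prediction dataset with high coverage and diversity, named PSP. This dataset consists of 570K true structure sequences (10TB) and 745K complementary distillation sequences (15TB). In addition, we provide a benchmark training procedure for a SOTA protein structure prediction model on this dataset. We validate the utility of this dataset for training by participating in the CAMEO contest, in which our model won first place. We hope that our PSP dataset, together with the training benchmark, can enable a broader community of AI/biology researchers to pursue AI-driven protein-related research.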
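Graph neural networks (GNNs) have achieved promising performance in various real-world applications. However, recent studies have shown that GNNs are vulnerable to adversarial attacks. In this paper, we study a recently introduced realistic attack scenario on graphs: the graph injection attack (GIA). In the GIA scenario, the adversary is not able to modify the existing link structure and node attributes of the input graph; instead, the attack is performed by injecting adversarial nodes into it. We present an analysis of the topological vulnerability of GNNs under the GIA setting, based on which we propose the Topological Defective Graph Injection Attack (TDGIA) for effective injection attacks. TDGIA first introduces a topological defective edge selection strategy to choose the original nodes to connect with the injected ones. It then designs a smooth feature optimization objective to generate the features for the injected nodes. Extensive experiments on large-scale datasets show that TDGIA can consistently and significantly outperform various attack baselines in attacking dozens of defense GNN models. Notably, the performance drop on target GNNs resulting from TDGIA is more than double the damage brought by the best attack solution among hundreds of submissions on KDD-CUP 2020.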
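The constraint that defines the graph injection setting (existing links stay untouched, and only new nodes and their edges are added) can be sketched as follows. This is only an illustration of the setting, not TDGIA: the function name `inject_node` is hypothetical, the target nodes are supplied by the caller rather than chosen by any edge selection strategy, and feature generation for the injected node is not modelled.

```python
def inject_node(adj, targets):
    """Return (new_adj, new_id) with one injected node linked to `targets`.

    `adj` maps each node id (int) to the set of its neighbours in an
    undirected graph. The input graph is never modified: edges between
    original nodes are copied as-is, and only edges that touch the
    injected node are added.
    """
    if not adj:
        raise ValueError("the original graph must contain at least one node")
    new_id = max(adj) + 1
    # Copy the neighbour sets so the caller's graph stays untouched.
    new_adj = {node: set(neigh) for node, neigh in adj.items()}
    new_adj[new_id] = set()
    for target in targets:
        if target not in adj:
            raise KeyError(f"target {target} is not an original node")
        new_adj[new_id].add(target)
        new_adj[target].add(new_id)
    return new_adj, new_id
```

An attack method would differ from this sketch mainly in how it picks `targets` and how it assigns features to the injected node, which are the two parts the abstract attributes to TDGIA.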
Open peer review is a growing trend in academic publishing. Public access to peer review data can benefit both the academic and publishing communities. It also supports studies on review comment generation and, further, the realization of automated scholarly paper review. However, most existing peer review datasets do not cover the whole peer review process. In addition, their data are not diverse enough, as they are mainly collected from the field of computer science. These two drawbacks of the currently available peer review datasets need to be addressed to unlock more opportunities for related studies. In response, we construct MOPRD, a multidisciplinary open peer review dataset. This dataset consists of paper metadata, multiple manuscript versions, review comments, meta-reviews, authors' rebuttal letters, and editorial decisions. Moreover, we design a modular guided review comment generation method based on MOPRD. Experiments show that our method delivers better performance, as indicated by both automatic metrics and human evaluation. We also explore other potential applications of MOPRD, including meta-review generation, editorial decision prediction, author rebuttal generation, and scientometric analysis. MOPRD provides strong support for further studies in peer-review-related research and other applications.
Recent studies reveal that deep neural network (DNN) based object detectors are vulnerable to adversarial attacks that add perturbations to images, leading the detectors to produce wrong outputs. Most existing works focus on generating perturbed images, also called adversarial examples, to fool object detectors. Although the generated adversarial examples can retain a certain naturalness, most of them can still be easily noticed by human eyes, which limits their further application in the real world. To alleviate this problem, we propose a differential evolution based dual adversarial camouflage (DE_DAC) method, composed of two stages, to fool human eyes and object detectors simultaneously. Specifically, we try to obtain a camouflage texture that can be rendered over the surface of the object. In the first stage, we optimize the global texture to minimize the discrepancy between the rendered object and the scene images, making the object difficult for human eyes to distinguish. In the second stage, we design three loss functions to optimize the local texture, making object detectors ineffective. In addition, we introduce the differential evolution algorithm to search for near-optimal areas of the object to attack, improving the adversarial performance under certain attack area limitations. We also study the performance of adaptive DE_DAC, which can be adapted to the environment. Experiments show that our proposed method achieves a good trade-off between fooling human eyes and fooling object detectors under multiple specific scenes and objects.
With the development of science and technology, the Internet of Things (IoT) has gradually entered people's lives, bringing great convenience and improving work efficiency. Specifically, the IoT can replace humans in jobs that they cannot perform. Research on the Unmanned Aerial Vehicle (UAV), a new type of IoT vehicle, is progressing well, and its development prospects are very promising. However, privacy and communication are still very serious issues in drone applications. This is because most drones still use centralized cloud-based data processing, which may lead to leakage of the data collected by drones. At the same time, the large amount of data collected by drones may incur greater communication overhead when transferred to the cloud. Federated learning, as a means of privacy protection, can effectively address both problems. However, federated learning applied to UAV networks also needs to consider the heterogeneity of data, which is caused by regional differences in UAV regulation. In response, this paper proposes a new algorithm, FedBA, to optimize the global model and solve the data heterogeneity problem. In addition, we apply the algorithm to several real datasets, and the experimental results show that it outperforms other algorithms and improves the accuracy of the local model for UAVs.
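For context on the federated learning setting described above, the following is a minimal sketch of plain federated averaging (FedAvg), in which a server combines client model weights without seeing client data. It is not FedBA, whose details the abstract does not give; the function name `federated_average` and the flat-list weight representation are assumptions made for illustration.

```python
def federated_average(client_weights, client_sizes):
    """Average client weight vectors, weighted by local sample counts.

    `client_weights` is a list of equal-length lists of floats (one per
    client, e.g. one per UAV); `client_sizes` holds the number of local
    training samples behind each vector. Only the weights are shared with
    the server; the raw data stays on the clients.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client weight vector")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all weight vectors must have the same length")
    averaged = []
    for j in range(dim):
        weighted_sum = sum(w[j] * n for w, n in zip(client_weights, client_sizes))
        averaged.append(weighted_sum / total)
    return averaged
```

With heterogeneous client data, this sample-weighted mean can drift toward the larger clients, which is the kind of issue that heterogeneity-aware algorithms aim to mitigate.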
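Source code is essential for researchers to reproduce the methods and replicate the results of artificial intelligence (AI) papers. Some organizations and researchers manually collect AI papers with available source code to contribute to the AI community. However, manual collection is a labor-intensive and time-consuming task. To address this issue, we propose a method to automatically identify papers with available source code and extract their source code repository URLs. With this method, we find that 20.5% of regular papers from 10 top AI conferences published from 2010 to 2019 are identified as papers with available source code, and that 8.1% of these source code repositories are no longer accessible. We also create the XMU NLP Lab README Dataset, the largest dataset of labeled README files for source code documentation research. Through this dataset, we find that many README files provide no installation instructions or usage tutorials. Furthermore, a large-scale comprehensive statistical analysis is carried out to give a general picture of the source code of AI conference papers. The proposed solution can also go beyond AI conference papers to analyze other scientific papers from both journals and conferences, shedding light on more domains.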
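The URL extraction step mentioned above can be illustrated with a minimal sketch. This is not the authors' method, which the abstract does not describe; the list of hosting sites, the regular expression, and the function name `extract_repo_urls` are assumptions made for illustration only.

```python
import re

# Assumed pattern for this sketch: a repository URL is
# <host>/<owner>/<repository> on one of a few well-known hosting sites.
REPO_URL_RE = re.compile(
    r"https?://(?:www\.)?(?:github\.com|gitlab\.com|bitbucket\.org)"
    r"/[\w.-]+/[\w.-]+"
)


def extract_repo_urls(text):
    """Return the distinct repository URLs found in `text`, in order."""
    urls = []
    for match in REPO_URL_RE.finditer(text):
        # Drop sentence punctuation that the pattern may have swallowed.
        url = match.group().rstrip(".,;")
        if url not in urls:
            urls.append(url)
    return urls
```

A real pipeline would also need to decide whether a linked repository actually belongs to the paper and whether it is still reachable, neither of which this sketch attempts.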
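Semantic localization (SeLo) refers to the task of obtaining the most relevant locations in large-scale remote sensing (RS) images using semantic information such as text. As an emerging task based on cross-modal retrieval, SeLo achieves semantic-level retrieval with only caption-level annotation, which demonstrates its great potential for unifying downstream tasks. Although SeLo has been carried out in successive works, none has yet systematically explored and analyzed this urgent direction. In this paper, we thoroughly study this field and provide a complete benchmark, in terms of metrics and test data, to advance the SeLo task. First, based on the characteristics of this task, we propose multiple discriminative evaluation metrics to quantify the performance of the SeLo task. The devised significant area proportion, attention shift distance, and discrete attention distance are used to evaluate the generated SeLo map at the pixel level and the region level. Next, to provide standard evaluation data for the SeLo task, we contribute a diverse, multi-semantic, multi-objective semantic localization test set (AIR-SLT). AIR-SLT consists of 22 large-scale RS images and 59 test cases with different semantics, and aims to provide a comprehensive evaluation for retrieval models. Finally, we analyze the SeLo performance of RS cross-modal retrieval models in detail, explore the impact of different variables on this task, and provide a complete benchmark for the SeLo task. We also establish a new paradigm for RS referring expression comprehension and demonstrate the great advantage of SeLo in semantics by combining it with tasks such as detection and road extraction. The proposed evaluation metrics, semantic localization test set, and corresponding scripts are available at github.com/xiaoyuan1996/semanticlocalizationmetrics.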